We conduct a systematic study of backdoor vulnerabilities in normally trained Deep Learning models. They are as dangerous as backdoors injected by data poisoning because both can be equally exploited. We leverage 20 different types of injected backdoor attacks in the literature as the guidance and study their correspondences in normally trained models, which we call natural backdoor vulnerabilities. We find that natural backdoors are widely existing, with most injected backdoor attacks having natural correspondences. We categorize these natural backdoors and propose a general detection framework. It finds 315 natural backdoors in the 56 normally trained models downloaded from the Internet, covering all the different categories, while existing scanners designed for injected backdoors can at most detect 65 backdoors. We also study the root causes and defense of natural backdoors.
translated by 谷歌翻译
High-quality traffic flow generation is the core module in building simulators for autonomous driving. However, the majority of available simulators are incapable of replicating traffic patterns that accurately reflect the various features of real-world data while also simulating human-like reactive responses to the tested autopilot driving strategies. Taking one step forward to addressing such a problem, we propose Realistic Interactive TrAffic flow (RITA) as an integrated component of existing driving simulators to provide high-quality traffic flow for the evaluation and optimization of the tested driving strategies. RITA is developed with fidelity, diversity, and controllability in consideration, and consists of two core modules called RITABackend and RITAKit. RITABackend is built to support vehicle-wise control and provide traffic generation models from real-world datasets, while RITAKit is developed with easy-to-use interfaces for controllable traffic generation via RITABackend. We demonstrate RITA's capacity to create diversified and high-fidelity traffic simulations in several highly interactive highway scenarios. The experimental findings demonstrate that our produced RITA traffic flows meet all three design goals, hence enhancing the completeness of driving strategy evaluation. Moreover, we showcase the possibility for further improvement of baseline strategies through online fine-tuning with RITA traffic flows.
translated by 谷歌翻译
Since the recent success of Vision Transformers (ViTs), explorations toward transformer-style architectures have triggered the resurgence of modern ConvNets. In this work, we explore the representation ability of DNNs through the lens of interaction complexities. We empirically show that interaction complexity is an overlooked but essential indicator for visual recognition. Accordingly, a new family of efficient ConvNets, named MogaNet, is presented to pursue informative context mining in pure ConvNet-based models, with preferable complexity-performance trade-offs. In MogaNet, interactions across multiple complexities are facilitated and contextualized by leveraging two specially designed aggregation blocks in both spatial and channel interaction spaces. Extensive studies are conducted on ImageNet classification, COCO object detection, and ADE20K semantic segmentation tasks. The results demonstrate that our MogaNet establishes new state-of-the-art over other popular methods in mainstream scenarios and all model scales. Typically, the lightweight MogaNet-T achieves 80.0\% top-1 accuracy with only 1.44G FLOPs using a refined training setup on ImageNet-1K, surpassing ParC-Net-S by 1.4\% accuracy but saving 59\% (2.04G) FLOPs.
translated by 谷歌翻译
时空预测学习旨在通过从历史框架中学习来产生未来的帧。在本文中,我们研究了现有方法,并提出了时空预测学习的一般框架,其中空间编码器和解码器捕获框架内特征和中间时间模块捕获框架间相关性。尽管主流方法采用经常性单元来捕获长期的时间依赖性,但由于无法可行的架构,它们的计算效率低。为了使时间模块并行,我们提出了时间注意单元(TAU),该单元将时间关注分解为框内静态注意力和框架间动力学注意力。此外,虽然平方误差损失侧重于框架内错误,但我们引入了一种新颖的差异差异正则化,以考虑框架间的变化。广泛的实验表明,所提出的方法使派生模型能够在各种时空预测基准上实现竞争性能。
translated by 谷歌翻译
普遍的后门是由动态和普遍的输入扰动触发的。它们可以被攻击者故意注射,也可以自然存在于经过正常训练的模型中。它们的性质与传统的静态和局部后门不同,可以通过扰动带有一些固定图案的小输入区域来触发,例如带有纯色的贴片。现有的防御技术对于传统后门非常有效。但是,它们可能对普遍的后门无法正常工作,尤其是在后门去除和模型硬化方面。在本文中,我们提出了一种针对普遍的后门,包括天然和注射后门的新型模型硬化技术。我们基于通过特殊转换层增强的编码器架构来开发一般的普遍攻击。该攻击可以对现有的普遍后门攻击进行建模,并通过类距离进行量化。因此,使用我们在对抗训练中攻击的样品可以使模型与这些后门漏洞相比。我们对9个具有15个模型结构的9个数据集的评估表明,我们的技术可以平均扩大阶级距离59.65%,精度降解且没有稳健性损失,超过了五种硬化技术,例如对抗性训练,普遍的对抗训练,Moth,Moth等, 。它可以将六次普遍后门攻击的攻击成功率从99.06%降低到1.94%,超过七种最先进的后门拆除技术。
translated by 谷歌翻译
有条件的生成模型旨在学习数据和标签的基础联合分布,以实现有条件的数据生成。其中,辅助分类器生成的对抗网络(AC-GAN)已被广泛使用,但遭受了生成样品的阶层内多样性的问题。本文指出的基本原因是,AC-GAN的分类器是生成器 - 静脉器,因此不能为发电机提供接近联合分布的信息指导,从而最小化条件熵,从而减少了阶级内的阶级。多样性。在这种理解的推动下,我们提出了一个具有辅助判别分类器(ADC-GAN)的新型条件gan,以解决上述问题。具体而言,提出的辅助判别分类器通过识别真实数据的类标签和生成的数据而成为生成器感知。我们的理论分析表明,即使没有原始歧视者,发电机也可以忠实地学习联合分布,从而使拟议的ADC-GAN可靠,可适应该系数超参数的价值和GAN损失的选择,并在训练过程中稳定。关于合成和现实世界数据集的广泛实验结果表明,与基于最新的分类器和基于基于投影的条件gan相比,有条件生成建模中ADC-GAN的优势。
translated by 谷歌翻译
单击后键盘转换为指示用户偏好的强信号,是构建推荐系统的良性。但是,由于选择偏差,即,观察到的单击事件通常会发生在用户的首选项目,准确地估计点击后击率(CVR)是具有挑战性的。目前,大多数现有方法利用反事实学习到Debias推荐系统。其中,双重稳健(DR)估计器通过以双重稳健的方式组合基于误差估算的(EIB)估计和逆倾向分数(IPS)估计来实现竞争性能。然而,不准确的误差估算可能导致其比IPS估计器更高的方差。更糟糕的是,现有方法通常使用简单的模型 - 不可知方法来估计归纳错误,这不足以近似于近似于动态改变的模型相关目标(即预测模型的梯度方向)。为了解决这些问题,我们首先导出DR估算器的偏差和方差。基于它,已经提出了一种更强大的双重稳健(MRDR)估计器,以进一步降低其差异,同时保持其双重稳健性。此外,我们为MRDR估算器提出了一种新的双重学习方法,可以将误差归纳转换为一般的CVR估计。此外,我们经验验证所提出的学习方案可以进一步消除估算学习的高方差问题。为了评估其有效性,在半合成数据集和两个现实世界数据集上进行了广泛的实验。结果证明了所提出的方法的优越性在最先进的方法中。代码可在https://github.com/guosyjlu/mrdr-dl上获得。
translated by 谷歌翻译
Panoptic Part Segmentation (PPS) unifies panoptic segmentation and part segmentation into one task. Previous works utilize separated approaches to handle thing, stuff, and part predictions without shared computation and task association. We aim to unify these tasks at the architectural level, designing the first end-to-end unified framework named Panoptic-PartFormer. Moreover, we find the previous metric PartPQ biases to PQ. To handle both issues, we make the following contributions: Firstly, we design a meta-architecture that decouples part feature and things/stuff feature, respectively. We model things, stuff, and parts as object queries and directly learn to optimize all three forms of prediction as a unified mask prediction and classification problem. We term our model as Panoptic-PartFormer. Secondly, we propose a new metric Part-Whole Quality (PWQ) to better measure such task from both pixel-region and part-whole perspectives. It can also decouple the error for part segmentation and panoptic segmentation. Thirdly, inspired by Mask2Former, based on our meta-architecture, we propose Panoptic-PartFormer++ and design a new part-whole cross attention scheme to further boost part segmentation qualities. We design a new part-whole interaction method using masked cross attention. Finally, the extensive ablation studies and analysis demonstrate the effectiveness of both Panoptic-PartFormer and Panoptic-PartFormer++. Compared with previous Panoptic-PartFormer, our Panoptic-PartFormer++ achieves 2% PartPQ and 3% PWQ improvements on the Cityscapes PPS dataset and 5% PartPQ on the Pascal Context PPS dataset. On both datasets, Panoptic-PartFormer++ achieves new state-of-the-art results with a significant cost drop of 70% on GFlops and 50% on parameters. Our models can serve as a strong baseline and aid future research in PPS. Code will be available.
translated by 谷歌翻译
Rankings are widely collected in various real-life scenarios, leading to the leakage of personal information such as users' preferences on videos or news. To protect rankings, existing works mainly develop privacy protection on a single ranking within a set of ranking or pairwise comparisons of a ranking under the $\epsilon$-differential privacy. This paper proposes a novel notion called $\epsilon$-ranking differential privacy for protecting ranks. We establish the connection between the Mallows model (Mallows, 1957) and the proposed $\epsilon$-ranking differential privacy. This allows us to develop a multistage ranking algorithm to generate synthetic rankings while satisfying the developed $\epsilon$-ranking differential privacy. Theoretical results regarding the utility of synthetic rankings in the downstream tasks, including the inference attack and the personalized ranking tasks, are established. For the inference attack, we quantify how $\epsilon$ affects the estimation of the true ranking based on synthetic rankings. For the personalized ranking task, we consider varying privacy preferences among users and quantify how their privacy preferences affect the consistency in estimating the optimal ranking function. Extensive numerical experiments are carried out to verify the theoretical results and demonstrate the effectiveness of the proposed synthetic ranking algorithm.
translated by 谷歌翻译
In this work, we focus on instance-level open vocabulary segmentation, intending to expand a segmenter for instance-wise novel categories without mask annotations. We investigate a simple yet effective framework with the help of image captions, focusing on exploiting thousands of object nouns in captions to discover instances of novel classes. Rather than adopting pretrained caption models or using massive caption datasets with complex pipelines, we propose an end-to-end solution from two aspects: caption grounding and caption generation. In particular, we devise a joint Caption Grounding and Generation (CGG) framework based on a Mask Transformer baseline. The framework has a novel grounding loss that performs explicit and implicit multi-modal feature alignments. We further design a lightweight caption generation head to allow for additional caption supervision. We find that grounding and generation complement each other, significantly enhancing the segmentation performance for novel categories. We conduct extensive experiments on the COCO dataset with two settings: Open Vocabulary Instance Segmentation (OVIS) and Open Set Panoptic Segmentation (OSPS). The results demonstrate the superiority of our CGG framework over previous OVIS methods, achieving a large improvement of 6.8% mAP on novel classes without extra caption data. Our method also achieves over 15% PQ improvements for novel classes on the OSPS benchmark under various settings.
translated by 谷歌翻译